Throughout this document, hover over the numbered annotations to the right of code chunks to reveal detailed explanations and comments about the code. Where italicized drop-down text is present, click the arrow to expand it and see the code.
Data Importation
Data Sources
Procedure
Step 1: Efficiently install packages and load libraries
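One way to handle this step is sketched below, assuming base R only. The helper name install_and_load() is hypothetical and does not come from the original workflow, and the sketch assumes a CRAN mirror is already configured. It installs only the packages that are missing and then attaches all of them.

```r
# Hypothetical helper (not part of the original workflow)
install_and_load <- function(packages) {
  # Check which of the requested packages are already installed
  is_installed <- vapply(packages, requireNamespace, logical(1), quietly = TRUE)
  missing_packages <- packages[!is_installed]

  # Install only the missing packages (assumes a CRAN mirror is configured)
  if (length(missing_packages) > 0) {
    utils::install.packages(missing_packages)
  }

  # Attach each package and record whether attaching succeeded
  attached <- vapply(
    packages,
    function(pkg) {
      suppressPackageStartupMessages(
        require(pkg, character.only = TRUE, quietly = TRUE)
      )
    },
    logical(1)
  )

  # Return a named logical vector invisibly, one element per package
  invisible(attached)
}
```

For the packages referenced elsewhere in this document, the call might look like install_and_load(c("fs", "purrr")).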
create_vector_file_paths <- function(directory_path) {
  # List all files in the given directory path
  files_to_import <- fs::dir_ls(path = directory_path)

  # Loop through the files and print each with an index
  for (i in seq_along(files_to_import)) {
    cat(i, "= ", files_to_import[i], "\n")
  }

  # Return the vector of file paths
  return(files_to_import)
}

files_to_import <- create_vector_file_paths("data/raw")
The @iteratively-import-raw-data code chunk should be run only once each time the raw data are updated, because it takes a long time to execute. Otherwise, run the @efficiently-load-raw-data code chunk to import the up-to-date raw data quickly.
Step 3: Use the purrr::map() function to iteratively import files in the files_to_import vector except for the profiles data and .RData files
Refer to the output of the files_to_import data object to ensure you are inputting the correct index value corresponding to the file path that needs to be loaded.
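The iterative import in Step 3 could be sketched as follows. This is an illustration under stated assumptions, not the original chunk: it assumes the raw files are CSVs, and the demo directory and file names (sessions.csv, profiles.csv, raw_data.RData) are made up so the example runs on its own.

```r
library(purrr)

# Build a throwaway directory with hypothetical files so the sketch is runnable
raw_dir <- file.path(tempdir(), "raw_demo")
unlink(raw_dir, recursive = TRUE)
dir.create(raw_dir)
write.csv(data.frame(id = 1:2), file.path(raw_dir, "sessions.csv"), row.names = FALSE)
write.csv(data.frame(id = 3:5), file.path(raw_dir, "profiles.csv"), row.names = FALSE)
save(raw_dir, file = file.path(raw_dir, "raw_data.RData"))

files_to_import <- list.files(raw_dir, full.names = TRUE)

# Drop the profiles file and any .RData files before importing
keep <- !grepl("profiles|\\.RData$", basename(files_to_import), ignore.case = TRUE)
files_kept <- files_to_import[keep]

# Name each path after its file (without the extension), then read every file
named_paths <- set_names(files_kept, tools::file_path_sans_ext(basename(files_kept)))
raw_data <- map(named_paths, function(path) utils::read.csv(path))

# Names of the imported data frames
names(raw_data)
```

Filtering by file name rather than by position avoids having to update hard-coded index values when files are added to or removed from the directory.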
Step 4: Efficiently import up-to-date raw data
base::load(files_to_import[10])
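The import-once, load-afterwards pattern behind Steps 3 and 4 can be sketched in base R as below. The cache path, the object name raw_data, and the import_raw_data() stand-in are all hypothetical and chosen only for illustration.

```r
# Hypothetical cache location, for illustration only
cache_path <- file.path(tempdir(), "raw_data_demo.RData")

# Hypothetical stand-in for the slow, iterative import step
import_raw_data <- function() {
  list(sessions = data.frame(id = 1:2))
}

if (file.exists(cache_path)) {
  # Fast path: load() restores the saved object (raw_data) into the workspace
  load(cache_path)
} else {
  # Slow path: import once, then cache the result for later sessions
  raw_data <- import_raw_data()
  save(raw_data, file = cache_path)
}
```

Because load() restores objects under the names they were saved with, the rest of the analysis can refer to raw_data regardless of which branch ran.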
We will always use snake case when naming our data objects and functions (e.g., data_object_name or function_name()).
Iterate the export_to_csv(df, df_name, dir_path) function over each data frame. Here, .x refers to the data frame and .y refers to its name; both are passed to the export_to_csv() function along with the desired directory path.
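A minimal sketch of this iteration is shown below, assuming purrr::iwalk() is the iterator (the original does not say which purrr function is used). Only the signature of export_to_csv() comes from the source, so its body here is a hypothetical implementation, and the example data frames and output directory are made up.

```r
library(purrr)

# Hypothetical implementation: the source shows only the function signature
export_to_csv <- function(df, df_name, dir_path) {
  # Create the output directory if it does not exist yet
  dir.create(dir_path, recursive = TRUE, showWarnings = FALSE)

  # Write the data frame to <dir_path>/<df_name>.csv
  file_path <- file.path(dir_path, paste0(df_name, ".csv"))
  utils::write.csv(df, file_path, row.names = FALSE)
  invisible(file_path)
}

# Made-up named list of data frames to export
data_frames <- list(
  sessions = data.frame(id = 1:2),
  users = data.frame(id = 3:5)
)
output_dir <- file.path(tempdir(), "outputs_demo")

# .x is the data frame and .y is its name in the list
iwalk(data_frames, ~ export_to_csv(.x, .y, output_dir))

# Files now present in the output directory
list.files(output_dir)
```

iwalk() is used rather than imap() because the export is called for its side effect (writing files) and no return value is needed.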
Export the merged final dataset to the data/outputs folder